Abstract

The sphere decoder (SD) is an attractive low-complexity alternative 
to maximum likelihood (ML) detection in a variety of communication 
systems. It is also employed in multiple-input multiple-output (MIMO) 
systems where the computational complexity of the optimum detector 
grows exponentially with the number of transmit antennas. We propose
an enhanced version of the SD based on an additional cost function derived
from conditions on worst-case interference, which we call dominance
conditions. The proposed detector, the king sphere decoder (KSD), has
a computational complexity that is never larger than that of the sphere
decoder, and numerical simulations show that the complexity reduction is
usually quite significant.

1 Introduction 

Current system design for wireless communications assumes the presence of
multiple antennas at both transmit and receive locations in order to meet the
requirements for high-data-rate transmission [1]. The main reason is that the
equivalent multiple-input multiple-output (MIMO) channel provides diversity
and/or capacity gains to the system; in the latter case, compared to
single-antenna systems, capacity is increased by a factor equal to the minimum
of the numbers of transmit and receive antennas.

The problem of (optimal) maximum-likelihood (ML) decoding in MIMO
systems is known to be exponentially complex in the number of transmit
antennas [2,3]. Various suboptimal algorithms have been developed as
low-complexity alternatives to ML decoding, e.g. branch and bound techniques,
lattice-based approaches [5] and other tree-search algorithms such as the A*
algorithm [6]. A comprehensive study highlighting the connections among various
approaches for low-complexity ML decoding in wireless communications is found
in [7].

In the framework of communication and information theory, the term sphere 
decoder (SD) usually refers to a collection of extremely efficient algorithms based 
on number-theoretic tools, providing optimal or nearly-optimal solutions with 
reduced average computational complexity with respect to the exhaustive search
of standard ML decoding. Inspired by the work on vector search in lattices
[8,9], various SD algorithms have been proposed, e.g. for ML sequence
estimation in channels with memory and ML decoding for multidimensional
modulations in fading channels. SD has then been extended to the context of
multiantenna systems, both for uncoded and space-time coded transmissions
[12]. Descriptions and performance comparisons of different methods for
SD-based ML decoding are found in [13,14]: both works conclude that
Schnorr-Euchner-based SD (SESD) outperforms other SD variants. Furthermore,
the limitation of the algorithm to underloaded scenarios, i.e. with a number
of transmit antennas not exceeding the number of receive antennas, has been
tackled in subsequent works dealing with optimal decoding in overloaded
(underdetermined) systems [15-17]. It is worth noting that some works showed
that the expected complexity of SD is polynomial for a wide range of numbers
of antennas and signal-to-noise ratio (SNR) values [18,19]; however, according
to a more rigorous definition of expected complexity, other works state that
SD exhibits exponential complexity, although reduced w.r.t. ML [20]. Other SD
algorithms approaching near-ML performance and suitable for implementation
with very large scale integration (VLSI) architectures have been proposed in
[21].



A different approach for ML decoding, based on dominance conditions, has 
been studied in [22-24] for systems adopting BPSK or QPSK modulation, and
then extended in [25] to arbitrary-size PSK modulation. This algorithm,
called the king decoder (KD), provides the ML solution and is thus optimal
in terms of symbol error rate (SER) performance. Two major
advantages are: (i) no matrix inversion and/or factorization is needed; (ii) the 
same algorithm applies to both underloaded and overloaded systems. 

The main contribution of this paper is an enhanced version of SD, which is 
based on an additional cost function derived from dominance conditions, thus 
exploiting the properties of KD. The new algorithm presents a significantly 
reduced computational complexity, measured as the average number of visited 
nodes, w.r.t. the classic SD. 

The rest of the paper is organized as follows: in Section 2 we present the
mathematical model for the system under investigation; Section 3 describes the
SD; dominance conditions, representing the core of the proposed improvement,
are analytically studied in Section 4; the proposed KSD for MIMO detection is
described in Section 5; in Section 6 we show and compare the performance in
terms of computational complexity obtained via numerical simulations; finally,
concluding remarks are given in Section 7.

Notation - Lower-case bold letters denote vectors, with $a_n$ denoting the nth
entry of $\mathbf{a}$; upper-case bold letters denote matrices, with $a_{n,m}$
and $\mathbf{a}_m$ denoting the (n,m)th entry and the mth column of
$\mathbf{A}$, respectively; $\mathbb{E}\{\cdot\}$, $(\cdot)^*$, $(\cdot)^T$,
$(\cdot)^H$, and $\|\cdot\|^2$ denote expectation, conjugate, transpose,
conjugate-transpose and squared Frobenius norm operators, respectively.
2 System Model



We consider a narrowband MIMO system with K transmit antennas and N 
receive antennas, described by the following vector model 

y = Hx + n, (1) 

where $\mathbf{y} \in \mathbb{C}^N$ is the received vector, whose entry $y_i$
represents the signal received by the ith receive antenna; $\mathbf{H} \in
\mathbb{C}^{N \times K}$ is the channel matrix, whose entry $h_{i,j}$
represents the fading coefficient between the jth transmit antenna and the ith
receive antenna; $\mathbf{x} \in \mathbb{C}^K$ is the transmitted vector, whose
entry $x_j$ represents the symbol transmitted by the jth transmit antenna;
$\mathbf{n} \in \mathbb{C}^N$ is the additive noise vector, modeled as
zero-mean complex Gaussian with covariance
$\mathbb{E}\{\mathbf{n}\mathbf{n}^H\} = \eta_0 \mathbf{I}_N$. Transmitted
symbols are drawn from a finite set of complex symbols $\mathcal{X}$, which
depends on the chosen modulation scheme. The channel vector from the kth
transmit antenna is $\mathbf{h}_k$, i.e. the kth column of the channel matrix.
Also, we assume perfect channel state information at the receiver.

The problem of optimally decoding $\mathbf{x}$ from the knowledge of
$\mathbf{y}$ is formulated as follows

$$\mathbf{x}_{\mathrm{ML}} = \arg\min_{\mathbf{x} \in \mathcal{X}^K} \|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2 \tag{2}$$

where exhaustive search is clearly prohibitive for sizes of interest, hence the
need for low-complexity alternatives. Assuming that the total average energy
transmitted over a single symbol period cannot exceed $E_x$, system
performance is evaluated with respect to the SNR per receive antenna, i.e.
$\mathrm{SNR} = E_x/\eta_0$.
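As a point of reference for what follows, the exhaustive search in (2) can be sketched in a few lines. This is only an illustrative sketch: the function name `ml_detect`, the toy 2x2 channel and the noiseless observation are assumptions made for the example, not part of the system under study.

```python
from itertools import product

def ml_detect(H, y, alphabet):
    """Exhaustive ML search of eq. (2): try every x in alphabet^K."""
    K = len(H[0])
    best_x, best_metric = None, float("inf")
    for x in product(alphabet, repeat=K):
        # squared Euclidean distance ||y - Hx||^2
        metric = sum(abs(y_i - sum(h * s for h, s in zip(row, x))) ** 2
                     for row, y_i in zip(H, y))
        if metric < best_metric:
            best_x, best_metric = x, metric
    return best_x, best_metric

# Toy 2x2 example with 4-QAM symbols (values chosen only for illustration).
QAM4 = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
H = [[1.0 + 0.5j, 0.2 - 0.1j],
     [0.3 + 0.2j, 0.9 - 0.4j]]
x_true = (1 - 1j, -1 + 1j)
y = [sum(h * s for h, s in zip(row, x_true)) for row in H]  # noiseless observation
x_hat, metric = ml_detect(H, y, QAM4)
```

The loop visits all $|\mathcal{X}|^K$ candidates (16 in this toy case), which is exactly the exponential cost that the tree-search methods of the following sections try to avoid.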

It is worth noting that other kinds of systems for multiuser communications,
such as direct-sequence code-division multiple-access (DS-CDMA) [2] and
multi-carrier code-division multiple-access (MC-CDMA) [26], share the same
linear model with additive noise described by (1).



3 Sphere Decoder 

The idea of sphere decoding is to restrict the search to transmitted vectors
whose received constellation counterparts $\mathbf{H}\mathbf{x}$ lie inside a
hyper-sphere of radius $r$ centered on the received signal $\mathbf{y}$, that
is

$$\|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2 < r^2. \tag{3}$$

If the sphere contains no vectors, the algorithm either fails or restarts with
an increased radius. In the latter case the result of the algorithm is always
the optimal ML solution, obtained with reduced computational complexity when
the number of vectors in the sphere is small compared to the overall number
of possible transmitted vectors, i.e. $|\mathcal{X}|^K$. The choice of the
radius is crucial in order to obtain a computational complexity gain; in the
ideal case the sphere should include just one vector.

The test in (3) is efficiently performed by exploiting the QL (corresp. QR)
factorization of the channel matrix $\mathbf{H}$ in terms of a unitary matrix
$\mathbf{Q}$ (i.e. $\mathbf{Q}^H\mathbf{Q} = \mathbf{I}_N$) and a
lower-triangular matrix $\mathbf{L}$ (corresp. upper-triangular matrix
$\mathbf{R}$). In this case (2) can be equivalently formulated as

$$\mathbf{x}_{\mathrm{ML}} = \arg\min_{\mathbf{x} \in \mathcal{X}^K} \|\tilde{\mathbf{y}} - \mathbf{L}\mathbf{x}\|^2 \tag{4}$$

$$= \arg\min_{\mathbf{x} \in \mathcal{X}^K} \sum_{j=1}^{K} \Big| \tilde{y}_j - \sum_{i=1}^{j} l_{j,i} x_i \Big|^2 \tag{5}$$

where $\tilde{\mathbf{y}} = \mathbf{Q}^H\mathbf{y}$. The factorization enables
the test in (3) to be formulated as a tree search with pruning. In fact the
summation in (5) can be performed on a tree with K + 1 layers where the term

$$\Big| \tilde{y}_i - \sum_{j=1}^{i} l_{i,j} x_j \Big|^2 \tag{6}$$

can be computed at each node of layer i. The advantage of this formulation
is that the partial distance in (6) is always non-negative; this implies that
child nodes never have a smaller accumulated distance than their parent, i.e.
the metric is said to be cumulative. Therefore at each node at layer i we can
compute the accumulated partial distance

$$\sum_{k=1}^{i} \Big| \tilde{y}_k - \sum_{j=1}^{k} l_{k,j} x_j \Big|^2 \tag{7}$$

and compare it with a threshold, corresponding to $r^2$. The algorithm selects
only the nodes leading to leaves that are within the sphere and at the same
time computes the metric that will be used at the end to select the optimal
solution. As stated before, if no leaves are contained in the sphere then the
radius is increased and the search on the tree is restarted.
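The cumulative property of the metric can be checked numerically with a minimal sketch. It assumes that the factorization has already been carried out, so that a real lower-triangular matrix `L` and the rotated observation `y_t` are available; the function name and the numerical values are hypothetical and serve only as an illustration.

```python
def partial_distances(L, y_t, x):
    """Accumulated partial distances d_1..d_K of eq. (7) for lower-triangular L."""
    acc, out = 0.0, []
    for i in range(len(x)):
        # branch metric of eq. (6): row i only involves x_1..x_i
        e = y_t[i] - sum(L[i][j] * x[j] for j in range(i + 1))
        acc += abs(e) ** 2
        out.append(acc)
    return out

# Illustrative values (assumed, not taken from the simulations).
L = [[2.0, 0.0, 0.0],
     [0.5, 1.5, 0.0],
     [-0.3, 0.4, 1.0]]
y_t = [1.8, -1.2, 0.9]
d = partial_distances(L, y_t, [1, -1, 1])
```

The returned sequence never decreases along a path, and its last entry equals the full distance in (4); this is what allows a whole subtree to be pruned as soon as an intermediate entry exceeds $r^2$.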

There are two possible strategies to perform the tree search: the breadth-first
search (BFS) and the depth-first search (DFS) [7]. In the breadth-first search,
all surviving nodes of the same level are visited before moving to the next
level, until the leaves are reached. In the depth-first search, at each level
only one node is visited, and by following its children a leaf is reached in K
steps. At this point the radius is updated and the algorithm proceeds with
other nodes starting from upper levels. While in BFS the tree is swept level by
level from top to bottom, in DFS it is swept branch by branch from one side to
the other. In the latter case the algorithm can be started with an infinite
radius, as it can be updated as soon as the first leaf is reached. The
performance of the SD algorithm can be improved by choosing a proper
enumeration order. In Fincke-Pohst enumeration [8] branches are enumerated
in a natural fashion, while in Schnorr-Euchner enumeration [9,13] branches are
selected in a zig-zag fashion for QAM constellations along each dimension.
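The difference between the two enumeration orders can be illustrated on a single real dimension. The sketch below assumes that `center` is the unconstrained estimate of the symbol at the current layer; both this name and the helper `schnorr_euchner_order` are hypothetical and used only for illustration.

```python
def schnorr_euchner_order(center, symbols):
    """Sort candidate symbols by increasing distance from `center`."""
    return sorted(symbols, key=lambda s: abs(s - center))

pam4 = [-3, -1, 1, 3]                      # natural (Fincke-Pohst-like) order
order = schnorr_euchner_order(0.4, pam4)   # zig-zag around the estimate 0.4
```

With the estimate at 0.4 the candidates are visited as 1, -1, 3, -3, i.e. alternating around the center, so that the most promising branch is explored first and the radius shrinks early in a depth-first search.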
The computational complexity of the sphere decoding algorithm is measured
by the average number of visited nodes needed to obtain (2) [20]. That figure
is closely related to the time required by the algorithm to provide the
solution and hence to the throughput that is achievable in currently available
digital hardware [27]. The computational cost of the QL factorization is not
considered here, since it is computed once and for all and it represents a
negligible factor in the overall complexity.
4 Dominance Conditions



In the sphere decoding algorithm at each node the partial distance is checked in 
order to exclude some branches in the tree. Another condition can be derived 
from the Euclidean distance that can improve the computational complexity 
of the sphere decoder. In this section we derive a set of sufficient conditions 
that can be used to exclude some possible transmitted vectors from the set of 
candidates in the ML search. 

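Geometrically the ML solution is given by the vector $\mathbf{x}$ that
minimizes the Euclidean distance

$$f(\mathbf{x}) = (\mathbf{y} - \mathbf{H}\mathbf{x})^H (\mathbf{y} - \mathbf{H}\mathbf{x}). \tag{8}$$

We first define the difference of the Euclidean distance between two generic
points of $\mathcal{X}^K$.

Definition 1. Given two generic vectors $\mathbf{x}$ and $\tilde{\mathbf{x}}$,
with $\mathbf{x}, \tilde{\mathbf{x}} \in \mathcal{X}^K$, the discrete
difference is defined as $\Delta f(\mathbf{x}; \tilde{\mathbf{x}}) =
f(\mathbf{x}) - f(\tilde{\mathbf{x}})$.

Definition 2. The discrete difference related to vectors differing only in the
kth component is called the kth discrete difference, or discrete difference
along the kth coordinate, and is denoted
$\Delta_k f(\mathbf{x}; \tilde{\mathbf{x}})$.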

A necessary and sufficient condition for $\mathbf{x}$ to be a global minimum of
the cost function $f(\mathbf{x})$ is then that all discrete differences
$\Delta f(\mathbf{x}; \tilde{\mathbf{x}})$ are non-positive for each
$\tilde{\mathbf{x}} \in \mathcal{X}^K$. Searching for the global minimum just
by looking at the differences does not, by itself, reduce the computational
complexity of the ML search: the number of differences to compute is still
exponential in the number of inputs and grows with the size of the
constellation. However, as will become clearer in the following, we can avoid
looking at all differences and still get the optimal solution.

In the special case of the Euclidean distance the discrete difference along the
generic kth coordinate takes on a specific expression, as stated by the
following proposition.

Proposition 1. For any pair of vectors $\mathbf{x}$ and $\tilde{\mathbf{x}}$
that belong to $\mathcal{X}^K$ and differ only in the kth position

$$\Delta_k f(\mathbf{x}; \tilde{\mathbf{x}}) = -2\Re\Big\{ (x_k - \tilde{x}_k)^* \Big[ \mathbf{h}_k^H \mathbf{y} - \sum_{i \neq k} \mathbf{h}_k^H \mathbf{h}_i x_i \Big] \Big\} + \big(|x_k|^2 - |\tilde{x}_k|^2\big) \mathbf{h}_k^H \mathbf{h}_k. \tag{9}$$

Proof. See Appendix 8. □

The kth discrete difference in (9) depends on the observed vector $\mathbf{y}$
and on the symbols of the other elements of the input vector $\mathbf{x}$, i.e.
$x_i$, $i \neq k$. The sign of the discrete difference
$\Delta_k f(\mathbf{x}; \tilde{\mathbf{x}})$ determines which of the two
possible transmit vectors $\mathbf{x}$ and $\tilde{\mathbf{x}}$ is closer to
the observation $\mathbf{y}$.
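Proposition 1 can be verified numerically by comparing the closed form (9) with the direct difference $f(\mathbf{x}) - f(\tilde{\mathbf{x}})$. The following is a minimal sketch under assumed values: the helper names and the 2x2 complex channel are hypothetical, and the two symbols in position k are chosen with different moduli so that the second term of (9) is exercised.

```python
def f(H, y, x):
    """Euclidean distance of eq. (8)."""
    return sum(abs(y_i - sum(h * s for h, s in zip(row, x))) ** 2
               for row, y_i in zip(H, y))

def col(H, k):
    return [row[k] for row in H]

def inner(a, b):
    """Inner product a^H b."""
    return sum(ai.conjugate() * bi for ai, bi in zip(a, b))

def discrete_difference(H, y, x, k, xk_alt):
    """Closed form of eq. (9) for x and the vector with xk_alt in position k."""
    hk = col(H, k)
    interference = sum(inner(hk, col(H, i)) * x[i]
                       for i in range(len(x)) if i != k)
    bracket = inner(hk, y) - interference
    first = -2 * ((x[k] - xk_alt).conjugate() * bracket).real
    second = (abs(x[k]) ** 2 - abs(xk_alt) ** 2) * inner(hk, hk).real
    return first + second

# Illustrative values (assumed): 16-QAM-like symbols with different moduli.
H = [[1 + 1j, 0.5 - 0.2j],
     [0.3j, 1 - 0.5j]]
y = [0.7 - 0.4j, -1.2 + 0.9j]
x = [3 + 1j, -1 - 1j]
x_alt = [1 + 1j, -1 - 1j]          # differs from x only in position k = 0
closed_form = discrete_difference(H, y, x, 0, x_alt[0])
direct = f(H, y, x) - f(H, y, x_alt)
```

The two quantities agree up to rounding, while the closed form needs only the kth column of the channel correlation matrix instead of two full distance evaluations.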

The discrete difference expression in (9) can be simplified if a constellation
with constant modulus is employed, as the second term on the right-hand side
of (9) becomes zero. In this case the discrete difference reduces to

$$\Delta_k f(\mathbf{x}; \tilde{\mathbf{x}}) = -2\Re\Big\{ (x_k - \tilde{x}_k)^* \Big[ \mathbf{h}_k^H \mathbf{y} - \sum_{i \neq k} \mathbf{h}_k^H \mathbf{h}_i x_i \Big] \Big\}. \tag{10}$$



4.1 Dominance conditions for 4-QAM 

Since 4-QAM constellations are separable, we can equivalently consider a
real-valued system model, whose dimensions are doubled, with binary signaling,
i.e. $\mathcal{X} = \{-1, +1\}$. In the following, theoretical results will be
derived referring to the real-valued system model. In this case, the kth
discrete difference is

$$\Delta_k f(\mathbf{x}; \tilde{\mathbf{x}}) = -2 (x_k - \tilde{x}_k) \Big[ \mathbf{h}_k^T \mathbf{y} - \sum_{i \neq k} x_i \mathbf{h}_k^T \mathbf{h}_i \Big]. \tag{11}$$

Eq. (11) can be used to make an optimal decision under the assumption that
the contributions due to the other components of vector $\mathbf{x}$ are known.
From (11) a necessary condition can be derived for BPSK constellations as
follows. The discrete difference is non-positive when the two factors on the
right-hand side of (11) have the same sign

$$\mathrm{sign}(x_k - \tilde{x}_k) = \mathrm{sign}\Big[ \mathbf{h}_k^T \mathbf{y} - \sum_{i \neq k} x_i \mathbf{h}_k^T \mathbf{h}_i \Big]. \tag{12}$$

Note however that in the binary case there exists only one adjacent point, i.e.
$\tilde{x}_k = -x_k$, and the above equation can be written as

$$\mathrm{sign}(2 x_k) = \mathrm{sign}\Big[ \mathbf{h}_k^T \mathbf{y} - \sum_{i \neq k} x_i \mathbf{h}_k^T \mathbf{h}_i \Big] \tag{13}$$

that can be equivalently rewritten as

$$x_k = \mathrm{sign}\Big[ \mathbf{h}_k^T \mathbf{y} - \sum_{i \neq k} x_i \mathbf{h}_k^T \mathbf{h}_i \Big]. \tag{14}$$

From (14) we have that the ML solution must satisfy the set of equations

$$x_k^{\mathrm{ML}} = \mathrm{sign}\Big[ \mathbf{h}_k^T \mathbf{y} - \sum_{i \neq k} x_i^{\mathrm{ML}} \mathbf{h}_k^T \mathbf{h}_i \Big], \quad k = 1, \ldots, K \tag{15}$$

which provide the set of local minima of the Euclidean distance, thus
representing a necessary condition for the ML solution, as for these points all
the kth discrete differences are non-positive.
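The role of (15) as a necessary (but not sufficient) condition can be illustrated by enumerating the local minima of a small real-valued BPSK system. The sketch below uses assumed numerical values and a hypothetical helper `is_local_minimum`; the condition is checked as $x_k b_k \geq 0$, where $b_k$ denotes the bracket in (15), which is the same as requiring a non-positive kth discrete difference.

```python
from itertools import product

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def distance(H, y, x):
    return sum((y_i - dot(row, x)) ** 2 for row, y_i in zip(H, y))

def is_local_minimum(H, y, x):
    """True if every component of x satisfies eq. (15)."""
    K = len(x)
    cols = [[row[k] for row in H] for k in range(K)]
    for k in range(K):
        b = dot(cols[k], y) - sum(x[i] * dot(cols[k], cols[i])
                                  for i in range(K) if i != k)
        if x[k] * b < 0:   # x_k disagrees with the sign required by eq. (15)
            return False
    return True

# Illustrative real-valued system (assumed values).
H = [[1.0, 0.4, -0.3],
     [0.2, 0.9, 0.5],
     [-0.1, 0.3, 1.1]]
y = [0.5, -0.8, 1.4]
candidates = list(product((-1, 1), repeat=3))
local_minima = [x for x in candidates if is_local_minimum(H, y, x)]
x_ml = min(candidates, key=lambda x: distance(H, y, x))
```

The ML vector always appears in `local_minima`, but the list may contain other points as well, which is why a search restricted to local minima still needs a final metric comparison.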

It is interesting to note that the same set of equations has been derived
in the context of Hopfield neural networks (HNN) [28,29] and applied to ML
decoding. In [30,31], detectors for code division multiple access (CDMA) were
proposed for the first time, and the idea has then been further developed
in [32-35]. Eq. (14) represents the discrete-time approximation of the
equation of motion of neurons, as the metric of the ML optimum detector can be
mapped to the energy function of the HNN and the ML solution is sought as the
result of the dynamic update of (14) (see for example [35] and references
therein). The search is therefore based on a gradient descent algorithm that
may not provide the exact ML solution, but rather only a local minimum.
Furthermore, when the updates of the discrete-time equations are done in
parallel, the solution may also present limit cycles and no convergence to a
fixed point [36]. In order to prevent the updating rule from entering a limit
cycle and to force the dynamic update through increasing likelihood towards the
global minimum, a modified HNN approach to ML decoding is proposed in [37],
leading to a family of likelihood ascent sub-optimal detectors (LAS). All these
algorithms are sub-optimal and can approach optimal performance only under
specific conditions.



The necessary conditions in (15) suggest restricting the search for the ML
solution to the set of local minima. Unfortunately, no method is known to
enumerate all equilibrium points, i.e. points that satisfy (15), with a
computational complexity that is not exponential. However, we can still
identify cases where the sign of the kth discrete difference, i.e. the kth
component of local minima, can be determined regardless of the contribution of
all other components of $\mathbf{x}$. A sufficient condition for the
determination of the sign of the kth discrete difference is given by the
following proposition.

Proposition 2. If the following condition is satisfied

$$|\mathbf{h}_k^T \mathbf{y}| > \sum_{i \neq k} |\mathbf{h}_k^T \mathbf{h}_i| \tag{16}$$

then the sign of the corresponding kth discrete difference for BPSK
constellation is determined regardless of the contribution of all other
components of $\mathbf{x}$.

Proof. See Appendix 9. □



Eq. (16) is a dominance condition because, when it holds, the kth component
of the projected received vector is so strong that it dominates all other
components. The dominance condition in (16) assumes that no symbols $x_i$,
$i \neq k$, are known. However, in sequential decoding, partial knowledge may
be available. In such cases the sign of the discrete difference depends only on
the subset of $x_i$ that are still to be decoded. A dominance condition can
therefore be given for the case where a subset $\mathcal{W}$ of symbols is
already available.

Proposition 3. Given the set of known symbols $\mathcal{W}$ and a set of
unknown symbols $\mathcal{O}$, if the following condition holds

$$\Big| \mathbf{h}_k^T \mathbf{y} - \sum_{i \in \mathcal{W}} x_i \mathbf{h}_k^T \mathbf{h}_i \Big| > \sum_{i \in \mathcal{O}} |\mathbf{h}_k^T \mathbf{h}_i| \tag{17}$$

then the sign of the corresponding kth discrete difference for BPSK
constellation is determined regardless of the contribution of all components
$x_i$ of $\mathbf{x}$ with $i \in \mathcal{O}$.

Proof. Analogous to the proof of Prop. 2. □



Eq. (17) generalizes (16): if an antenna is not dominant over its
multiantenna interference, it may still be conditionally dominant, as the
interference from the already known bits is canceled out.
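Both tests can be written as a single routine, since (16) is (17) with an empty set of known symbols. The sketch below is illustrative only: the function `dominant_sign`, the dictionary `known` (mapping already decoded indices to their symbols) and the 2x2 real channel are assumptions made for the example.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def dominant_sign(H, y, k, known):
    """Return +1 or -1 if eq. (17) fixes x_k given `known`, else None.

    `known` maps indices of already decoded symbols to their values;
    with known == {} the test reduces to eq. (16).
    """
    K = len(H[0])
    cols = [[row[j] for row in H] for j in range(K)]
    # left-hand side: projection of y minus the known interference
    lhs = dot(cols[k], y) - sum(s * dot(cols[k], cols[i])
                                for i, s in known.items())
    # right-hand side: worst-case interference of the unknown symbols
    rhs = sum(abs(dot(cols[k], cols[i]))
              for i in range(K) if i != k and i not in known)
    if abs(lhs) > rhs:
        return 1 if lhs > 0 else -1
    return None

# Illustrative real-valued system (assumed values).
H = [[1.0, 0.8],
     [0.0, 0.6]]
y = [0.2, -0.6]
unconditional = dominant_sign(H, y, 0, {})       # eq. (16) for k = 0
conditional = dominant_sign(H, y, 0, {1: -1})    # eq. (17) once x_1 = -1 is known
```

In this example symbol 0 is not dominant on its own, because the cross-correlation 0.8 exceeds the projection 0.2, but it becomes conditionally dominant once the interference of the known symbol is subtracted.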

The sufficient condition in (16) was also derived in [33], where it has been
used in a multiuser detection algorithm based on Hopfield neural networks.
Eqs. (16) and (17) have also been used in [22] for maximum-likelihood sequence
detection and then in [23] with a preprocessing algorithm for multiuser
detection. In [24] they have been used for a stand-alone tree-search algorithm
for low-complexity ML detection in spatial multiplexing MIMO systems, the king
decoder.



Eqs. (16) and (17) are satisfied if the off-diagonal terms of the channel
correlation matrix are small compared to the terms
$\mathbf{h}_k^T \mathbf{y}$, k = 1, ..., K. Whether the conditions are
satisfied or not depends on the received vector $\mathbf{y}$ and on the
structure of the channel or of the channel correlation matrix.



4.2 Dominance conditions for M-QAM 

In the case of M-QAM constellations, dominance conditions can be expressed
in terms of those for 4-QAM, when $M = 2^n$ and n is an even number, e.g.
16-QAM. In fact such QAM constellations can be written as a weighted linear
combination of n/2 4-QAMs [15]. For example, the 16-QAM transmit vector
can be expressed as

$$\mathbf{x} = \mathbf{x}_1 + 2\mathbf{x}_2 \tag{18}$$

where $\mathbf{x}_1$, $\mathbf{x}_2$ are 4-QAM vectors. Consequently, the
system model (1) can be written as

$$\mathbf{y} = \begin{bmatrix} \mathbf{H} & 2\mathbf{H} \end{bmatrix} \begin{bmatrix} \mathbf{x}_1 \\ \mathbf{x}_2 \end{bmatrix} + \mathbf{n} \tag{19}$$

which represents the equivalent model for 16-QAM MIMO systems with K
transmit and N receive antennas in terms of a 4-QAM MIMO system with 2K
transmit and N receive antennas. Based on this equivalence, we can restrict
our attention to 4-QAM MIMO systems without loss of generality.
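The equivalence between (18) and (19) can be checked with a short sketch. The channel values and the two 4-QAM vectors are assumed for illustration, and noise is omitted since it is the same in both models.

```python
QAM4 = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]

# Every 16-QAM point is obtained as a + 2b with a, b in 4-QAM, eq. (18).
points = {a + 2 * b for a in QAM4 for b in QAM4}

# Illustrative 2x2 channel and 4-QAM component vectors (assumed values).
H = [[1.0 + 0.5j, 0.2 - 0.1j],
     [0.3 + 0.2j, 0.9 - 0.4j]]
x1 = [1 + 1j, -1 + 1j]
x2 = [1 - 1j, -1 - 1j]
x = [a + 2 * b for a, b in zip(x1, x2)]          # 16-QAM vector

H_eq = [row + [2 * h for h in row] for row in H]  # [H 2H], size N x 2K
y_16qam = [sum(h * s for h, s in zip(row, x)) for row in H]
y_equiv = [sum(h * s for h, s in zip(row, x1 + x2)) for row in H_eq]
```

Both noiseless observations coincide, so a detector designed for 4-QAM with 2K inputs can be applied unchanged to the 16-QAM system with K inputs.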



4.3 Dominance conditions for M-PSK 

It is possible to derive analogous dominance conditions in the general case of
M-PSK; however, in this case the real-valued system model does not hold.
Dominance conditions based on the complex-valued system model have been derived
and analyzed in [25]. Results are not reported here, as they are not necessary
for popular systems supporting QAM.



5 King Sphere Decoder 



The main contribution of this paper is the integration of the conditions (16)
and (17) in any sphere decoding algorithm. The idea that we propose is to use
the conditional dominance condition given by (17) at each node of the decoding
tree in addition to the partial distance condition of the standard sphere
decoding algorithm. The dominance conditions, when satisfied, allow us to cut
off branches that cannot correspond to the optimal solution and thus to reduce
the number of visited nodes, i.e. the computational complexity of the search.
The operation of the proposed algorithm is illustrated in Fig. 1, which shows
a decoding tree for a system with N = 5 and K = 5 antennas. At each node
we can check whether (17) is satisfied or not. For example the node pointed to
by the arrow corresponds to the dominance condition

$$\big| \mathbf{h}_3^T \mathbf{y} - \mathbf{h}_3^T \mathbf{h}_2 x_2 - \mathbf{h}_3^T \mathbf{h}_1 x_1 \big|_{x_1 = 1,\, x_2 = 1} > |\mathbf{h}_3^T \mathbf{h}_4| + |\mathbf{h}_3^T \mathbf{h}_5|. \tag{20}$$

If the condition is satisfied then a decision on the corresponding bit can be
made: only one of the two branches departing from that node is selected, and
half of the child nodes can be cut off. In our example the condition is
satisfied and a decision on bit 3 can be made: $x_3 = -1$ if $x_1 = 1$ and
$x_2 = 1$ or, equivalently, we can exclude all the vectors that have
$x_1 = 1$, $x_2 = 1$, $x_3 = 1$. At the end we obtain a set of possible ML
solutions, as shown in Fig. 1, where only 6 out of 32 paths survive.

Figure 1: Tree-search algorithm for a system with N = 5 and K = 5 antennas.
The transmitted bit vector is $\mathbf{x} = (1, -1, -1, -1, 1)^T$. Paths with
stars are provided by the dominance conditions alone, while the path with
square nodes is the ML solution.

The tree-search algorithm that makes use of the (conditional) dominance
conditions alone has already been presented in [24,25], where it has been
called the king decoder (KD). In general, at the end of the tree search the
optimal solution is selected among the survivors by computing the corresponding
metric, so the last step of the search involves the computation of Euclidean
distances for all survivors. In KD, rather than computing the Euclidean
distance at the end of the enumeration process, a different equivalent metric
has been introduced, which is cumulative and re-uses the computations done for
the dominance conditions [24].

We propose in this paper the inclusion of the dominance conditions as an
additional step in a generic tree-search algorithm for ML decoding. For
simplicity we restrict our attention to sphere decoding and we show that, at
the expense of a marginal increase of computational complexity at each node, a
significant reduction of the average number of visited nodes can be achieved.
By integrating the dominance conditions into sphere decoding, we can exclude
points that cannot be the ML solution before checking whether they lie within
the sphere. At the end of the tree search the partial metric computation
carried out by the sphere decoder can be used to select the optimal solution.
We call this enhanced version of the sphere decoder the king sphere decoder
(KSD).

Algorithm 1 Generic tree-search algorithm (adapted from [7])

reset_tree()    {initialize the tree}
init_search()   {e.g. reset partial distance}
init_ACTIVE()   {create an empty list of active nodes}
cn = root       {current node (cn) is root}
{main loop}
while cn is not empty do
    if cn is not a leaf then
        if cn is a valid node then
            get valid child nodes of cn
            sort valid child nodes
            insert valid child nodes in ACTIVE
            update node counter
        end if
    else
        select best node
        update bounding function
    end if
    get next node in ACTIVE
end while
if best node is empty then
    restart with an increased radius
else
    get the ML solution corresponding to the best node
end if

We consider the formulation of a generic tree-search algorithm based on the
pseudo-code provided by Murugan et al. in [7], which describes a generic
branch-and-bound algorithm. More generally, in a tree-search algorithm the
decision made at each node is based on a boolean condition that is not
necessarily expressed as a cost function compared to a bounding function, but
may be a combination of several boolean conditions.

In the algorithm, ACTIVE contains an ordered set of nodes to be visited. The
data structure used to implement ACTIVE determines the traversal strategy of
the tree. In the case of BFS a queue data structure can be employed, while a
stack is suitable for DFS. The algorithm starts with the initialization of the
radius, which can also be infinite, as in DFS. The main loop visits each valid
node of the tree, starting from the root. At each node, unless a leaf is
reached, the (conditional) dominance is checked first. If it is satisfied then
one of the two child nodes can be excluded and will not be visited, otherwise
no action is taken. Then, for each child node that has not been excluded, the
partial distance is computed and compared against the current radius. At this
point only nodes whose partial distance lies within the radius are considered
valid nodes. Therefore, in general, only a subset of the child nodes of each
node are valid and will be visited in the loop. Note that dominance conditions
are applied to the current node, and partial distances are computed on its
child nodes only if they are not excluded by the previous check. If valid nodes
that are generated from the current node need to be sorted, as for example in
Schnorr-Euchner enumeration, a sort function is called before nodes are
inserted in ACTIVE.
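The effect of the data structure chosen for ACTIVE can be seen on a bare binary tree without any pruning. The sketch below is an assumption-laden illustration: nodes are represented as tuples of already decided bits, and the function `traverse` is a hypothetical name.

```python
from collections import deque

def traverse(K, strategy):
    """Visit order of a binary tree of depth K with ACTIVE as queue or stack."""
    active = deque([()])          # ACTIVE initially holds the root only
    order = []
    while active:
        # queue (FIFO) gives BFS, stack (LIFO) gives DFS
        cn = active.popleft() if strategy == "BFS" else active.pop()
        order.append(cn)
        if len(cn) < K:           # cn is not a leaf: generate its children
            kids = [cn + (+1,), cn + (-1,)]
            # for a stack, push in reverse so the first child is visited first
            active.extend(kids if strategy == "BFS" else reversed(kids))
    return order

bfs = traverse(2, "BFS")
dfs = traverse(2, "DFS")
```

With the queue, all nodes of a level are visited before the next level; with the stack, the first leaf is reached after K steps, which is what allows the radius to be initialized to infinity in DFS.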

If the currently visited node is a leaf, then the best candidate can be chosen
according to the cumulatively computed metric and, depending on the tree
traversal strategy, the radius may be updated. If a BFS is employed then it may
happen that no leaf nodes are available at the end of the main loop (there is
no best node) and a new search must be performed with an increased radius.

Note that the only required modification with respect to the sphere decoding 
algorithm is contained in the function that generates valid child nodes. 

Dominance conditions introduced in the king sphere decoder can be seen as a
new set of constraints reducing the number of points to be visited, and for
which no partial distance needs to be computed, because the new algorithm
discards paths that surely cannot be local minima and hence cannot be the ML
solution.

As for the sphere decoder, the advantage of the KSD is the expected large
reduction of the number of visited nodes and hence of surviving paths. In the
best-case scenario, the dominance condition is satisfied at every visited node;
the algorithm then returns a unique solution that corresponds to the ML
solution, and only K nodes are visited, regardless of the choice of radius. In
general the number of visited nodes is greater than K because the condition
in (17) is not always satisfied. In the worst-case scenario no dominant bit is
found, and there is no decrease in the number of visited nodes with respect to
the original tree-search algorithm. While the added conditions might increase
the computational complexity at each node, the average number of visited nodes
can only decrease. Therefore the algorithm can only perform better in terms of
computational complexity measured as the average number of visited nodes, at
the expense of increased computation at each node. In practical implementations
this represents a good trade-off between speed, and hence achievable
throughput, and area on VLSI devices.
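The interplay between the two tests can be sketched with a depth-first search on a small real-valued BPSK system. Several simplifying assumptions are made here and are not part of the proposed algorithm as such: the model is taken as already factorized, so that a lower-triangular matrix `L` and the rotated observation `y` are given; the dominance test is evaluated on the columns of `L`, which is equivalent to using the original channel because the factorization leaves the correlation matrix and the projections of the observation unchanged; branches are tried in the fixed order +1, -1; and the root is counted among the visited nodes. The function name `tree_search` and all numerical values are hypothetical.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def tree_search(L, y, use_dominance):
    """Depth-first sphere decoding of y = Lx + n, x in {-1,+1}^K.

    Returns (best_x, best_metric, visited_nodes). With use_dominance=True
    the conditional dominance test of eq. (17) is applied at every node.
    """
    K = len(L)
    cols = [[row[k] for row in L] for k in range(K)]
    best = {"x": None, "r2": float("inf")}   # infinite initial radius (DFS)
    visited = 0

    def children(prefix):
        k = len(prefix)
        if use_dominance:
            # W = symbols already decoded on this path, O = symbols after k
            lhs = dot(cols[k], y) - sum(s * dot(cols[k], cols[i])
                                        for i, s in enumerate(prefix))
            rhs = sum(abs(dot(cols[k], cols[i])) for i in range(k + 1, K))
            if abs(lhs) > rhs:
                return [1 if lhs > 0 else -1]   # the other branch is cut off
        return [1, -1]

    def visit(prefix, dist):
        nonlocal visited
        visited += 1
        k = len(prefix)
        if k == K:                               # leaf: update the radius
            if dist < best["r2"]:
                best["x"], best["r2"] = tuple(prefix), dist
            return
        for s in children(prefix):
            e = y[k] - dot(L[k][:k], prefix) - L[k][k] * s
            d = dist + e * e                     # accumulated partial distance
            if d < best["r2"]:                   # sphere test on the child
                visit(prefix + [s], d)

    visit([], 0.0)
    return best["x"], best["r2"], visited

# Illustrative, weakly coupled system (assumed values).
L = [[1.0, 0.0, 0.0],
     [0.2, 1.0, 0.0],
     [0.1, -0.2, 1.0]]
y = [0.9, -0.7, 1.2]
x_sd, m_sd, n_sd = tree_search(L, y, use_dominance=False)
x_ksd, m_ksd, n_ksd = tree_search(L, y, use_dominance=True)
```

In this weakly coupled example the dominance condition holds at every node, so the search with dominance follows a single path from the root to the leaf, while the plain search also explores a branch that is later discarded; both return the same vector.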

The efficiency of the algorithm depends on the structure of the channel,
i.e. on the matrix $\mathbf{H}$, and is higher in those cases where the
off-diagonal elements of the channel correlation matrix are relatively small.
This might be the case for some correlated MIMO channel models that take into
account correlation among transmit and receive antennas, or for keyhole
channels [38].

Note that the dominance conditions do not require any matrix inversion or 
matrix factorization and can be employed unmodified both in underloaded and 
overloaded systems. 



6 Simulation Results 

The proposed algorithm always achieves optimal performance in terms of SER, by
construction. Performance is therefore measured in terms of the average number
of visited nodes. We have run Monte Carlo simulations in order to quantify the
improvement that can be gained with the proposed algorithm, since in the
worst-case scenario its performance is the same as that of SD.






Simulation results are presented with reference to two types of wireless
channels, with different mathematical structures in their channel matrices: (i)
independent fading, where entries of the channel matrix are assumed to be i.i.d.
according to a zero-mean complex Gaussian distribution with unit variance; (ii)
correlated fading, where a Kronecker model is assumed to take spatial
correlation into account [39]. More specifically, in the case of correlated
fading we assume that the channel matrix follows the structure [40]

$$\mathbf{H} = \mathbf{R}_r^{1/2} \mathbf{G} \mathbf{R}_t^{1/2}, \tag{21}$$

where $\mathbf{R}_t$ and $\mathbf{R}_r$ describe spatial correlation at
transmit and receive locations, respectively, and $\mathbf{G}$ matches the
independent fading structure.

In Fig. 2 results from simulations are shown for MIMO systems with different
numbers of transmit and receive antennas. Two MIMO channel models are
considered. The first is the standard MIMO channel model where the channel
matrix elements are drawn from a complex Gaussian distribution. The second
is the correlated MIMO channel, with correlation between transmit antennas and
between receive antennas as in (21), where the correlation matrices are
parameterized by the transmit and receive correlation indexes $\rho_t$ and
$\rho_r$, respectively. Results are obtained for $\rho_t = 0.5$ and
$\rho_r = 0.5$, for both SD and KSD with a depth-first (DF) search strategy,
in terms of the number of visited nodes averaged over the channel and noise
realizations as well as the possible transmitted vectors [20].

Figs. 2, 3 and 4 show that in practice dominance conditions can effectively
reduce the computational complexity of SD in all cases under consideration. The 
reduction is greater with correlated MIMO systems, suggesting that dominance 
conditions are more frequently satisfied in this case. 
Figure 2: Average number of visited nodes as function of average signal-to-noise
ratio. 4-QAM system with K = 2, N = 2. 

7 Conclusions 

We have proposed an enhanced version of the SD, namely the KSD, that presents
a lower computational complexity, measured in terms of average number of
visited nodes, w.r.t. the classic SD implementation. The reduction in
complexity is possible because an additional cost function is considered in the
standard tree-search based SD. The cost function is based on the dominance
conditions, which allow a decision to be taken when multiantenna interference
is not too strong. Therefore the KSD has all the features of any SD algorithm
and never performs worse in terms of visited nodes. Numerical simulations show
that for MIMO systems, both with independent and correlated fading statistics,
the dominance conditions effectively reduce the computational complexity of the
SD.

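8 Proof of Proposition 1

We explicitly write the discrete difference as:

$$\Delta_k f(\mathbf{x}; \tilde{\mathbf{x}}) = -2\Re\big\{ (x_k - \tilde{x}_k)^* \mathbf{h}_k^H \mathbf{y} \big\} + \mathbf{x}^H \mathbf{H}^H \mathbf{H} \mathbf{x} - \tilde{\mathbf{x}}^H \mathbf{H}^H \mathbf{H} \tilde{\mathbf{x}}. \tag{24}$$

The term $\mathbf{x}^H \mathbf{H}^H \mathbf{H} \mathbf{x} -
\tilde{\mathbf{x}}^H \mathbf{H}^H \mathbf{H} \tilde{\mathbf{x}}$ is a real
scalar, so we can apply the conjugate-transpose operator with no change to
obtain

$$\mathbf{x}^H \mathbf{H}^H \mathbf{H} \mathbf{x} - \tilde{\mathbf{x}}^H \mathbf{H}^H \mathbf{H} \tilde{\mathbf{x}} = \sum_i \sum_j \big( x_i^* x_j - \tilde{x}_i^* \tilde{x}_j \big) \mathbf{h}_i^H \mathbf{h}_j = 2\Re\Big\{ (x_k - \tilde{x}_k)^* \sum_{i \neq k} \mathbf{h}_k^H \mathbf{h}_i x_i \Big\} + \big(|x_k|^2 - |\tilde{x}_k|^2\big) \mathbf{h}_k^H \mathbf{h}_k. \tag{25}$$

By substituting (25) into (24) we can write the discrete difference as stated by
the proposition.

Figure 3: Average number of visited nodes as function of average signal-to-noise
ratio. 4-QAM system with K = 4, N = 4.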



9 Proof of Proposition 2

The sign of the kth discrete difference for $\mathbf{x}$ (w.r.t. its unique
adjacent vector $\tilde{\mathbf{x}}$ along the kth coordinate) is determined by
(14), reported in the following for convenience:

$$x_k = \mathrm{sign}\Big[ \mathbf{h}_k^T \mathbf{y} - \sum_{i \neq k} x_i \mathbf{h}_k^T \mathbf{h}_i \Big]. \tag{26}$$

When the sufficient condition (16) holds, the first term on the r.h.s. of (26)
is dominant over the sum representing the second term, independently of $x_i$,
$i \neq k$. In such a case (26) reduces to

$$x_k = \mathrm{sign}\big[ \mathbf{h}_k^T \mathbf{y} \big],$$

that is, the kth discrete difference depends only on $x_k$, as stated by the
proposition.

Figure 4: Average number of visited nodes as function of average signal-to-noise
ratio. 16-QAM system with K = 2, N = 4.